Several Approaches for Tweet Topic Classification in COSET - IberEval 2017

Authors

  • Carlos Villar Lafuente
  • Gonçal Garcés Díaz-Munío
Abstract

These working notes summarize the approaches we explored to classify a corpus of tweets related to the 2015 Spanish General Election (COSET 2017 task from IberEval 2017). Two approaches were tested during the COSET 2017 evaluations: Neural Networks with Sentence Embeddings (based on TensorFlow) and N-gram Language Models (based on SRILM). Our results with these approaches were modest: both ranked above the “Most frequent baseline”, but below the “Bag-of-words + SVM” baseline. A third approach was tried after the COSET 2017 evaluation phase was over: Advanced Linear Models (based on fastText). Results measured on the COSET 2017 Dev and Test sets show that this approach is well above the “TF-IDF+RF” baseline.

Similar resources

Overview of the 1st Classification of Spanish Election Tweets Task at IberEval 2017

This paper summarises the COSET shared task organised as part of the IberEval workshop. The aim of this task is to classify the topic discussed in a tweet into one of five topics related to the Spanish 2015 electoral cycle. A new dataset was curated for this task and hand-labelled by experts on the task. Moreover, the results of the 17 participants of the task and a review of their proposed sys...

Ensembles of Methods for Tweet Topic Classification

This paper describes the system we developed for the IberEval 2017 Classification Of Spanish Election Tweets (COSET) task. Our approach is based on a weighted average ensemble of five classifiers: 1) a classifier based on logistic regression; 2) a support vector machine classifier; 3) a Naive Bayes classifier for multinomial models; 4) a Gaussian Naive Bayes classifier; and 5) a classifier imple...

ELiRF-UPV at IberEval 2017: Classification Of Spanish Election Tweets (COSET)

This paper describes the participation of the ELiRF-UPV team in the Classification Of Spanish Election Tweets (COSET) task. We tested several approaches based on different classifiers and feature representations. Our main approach is based on neural networks, specifically Multilayer Perceptrons (MLP) with a bag-of-words representation of the tweets. Our system achieved the best score on the test set of t...

Short Text Classification Using Deep Representation: A Case Study of Spanish Tweets in Coset Shared Task

Topic identification, as a specific case of text classification, is one of the primary steps toward knowledge extraction from raw textual data. In such tasks, words are treated as a set of features. Due to the high dimensionality and sparseness of the feature vectors that result from traditional feature selection methods, most of the proposed text classification methods for this purpose lack performance...

Classification Of Spanish Election Tweets (COSET) with Neural Networks

Obtaining information from tweets has become a field of interest in recent years due to its power to provide information about the insights of the users when any relevant event occurs. This is useful for companies and political parties that take advantage of this information in order to plan their next actions or to know whether or not their current actions are being received well by their publ...

Publication date: 2017